- Path: mayne.ugrad.cs.ubc.ca!not-for-mail
- From: c2a192@ugrad.cs.ubc.ca (Kazimir Kylheku)
- Newsgroups: comp.lang.c,comp.unix.programmer
- Subject: Re: Q: '\n' character
- Date: 15 Apr 1996 16:33:14 -0700
- Organization: Computer Science, University of B.C., Vancouver, B.C., Canada
- Message-ID: <4kumbqINNgcr@mayne.ugrad.cs.ubc.ca>
- References: <4kj66f$k0o@ren.cei.net> <AD97189A966891F2@mcdiala02.it.luc.edu> <4ktn04INNoev@keats.ugrad.cs.ubc.ca> <4ku8f9$d3o@mark.ucdavis.edu>
- NNTP-Posting-Host: mayne.ugrad.cs.ubc.ca
-
- In article <4ku8f9$d3o@mark.ucdavis.edu>,
- James Knight <knight@quad.cs.ucdavis.edu> wrote:
-
- This is a good effort: I will try to look for any marginal improvements.
-
-
- >/*
- > * my_getline
- > *
- > * Read a line of any length, store it in an internal buffer, and
- > * return the internal buffer (along with a length value if desired).
- > *
- > * NOTE: Each line read will overwrite the previous line read. So,
- > * make a copy of any line you want to keep around.
- > *
- > * Parameters:
- > * fp - A FILE pointer open for reading.
- > * len_out - Address to where to store the line length.
- > *
- > * Returns:
- > * An internal buffer containing the line, or NULL on EOF or error.
- > */
- >char *my_getline(FILE *fp, int *len_out)
- >{
- >    static int bufsize = 0;
- >    static char *buffer = NULL;
- >    int size, len, flag;
- >
- >    /*
- >     * Initialize the internal buffer, if necessary.
- >     */
- >    if (buffer == NULL) {
- >        bufsize = 128;
- >        if ((buffer = malloc(bufsize)) == NULL)
- >            return NULL;
- >    }
- >
- >    /*
- >     * Read the first part of the line.
- >     */
- >    flag = 0;
- >    buffer[bufsize-2] = '\0';
- >
- >    if (fgets(buffer, bufsize, fp) == NULL)
- >        return NULL;
- >    else if (buffer[bufsize-2] == '\0' || buffer[bufsize-2] == '\n') {
- >        len = strlen(buffer);
- >        flag = 1;
- >    }
- >
- >    /*
- >     * If the line is longer, then realloc the internal buffer and
- >     * read the next section of the line.
- >     */
- >    while (!flag) {
- >        size = bufsize - 1;
- >        bufsize += bufsize;
-
- That's excellent! Doubling the buffer gives O(log n) reallocs with respect to
- line length. I was expecting a linear growth strategy, so this exceeds my
- expectation. A size that grows along the Fibonacci sequence might work well too.
-
- >        if ((buffer = realloc(buffer, bufsize)) == NULL)
- >            return NULL;
-
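- Just one quibble: when realloc() fails, the original block is not freed and its
- data is not lost. Assigning the result straight back to buffer overwrites the
- only pointer to that block with NULL. So you have to keep the original pointer
- around, and be ready to either leave the data as it is, or free() it.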
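-
- A minimal sketch of that pattern, assuming a hypothetical helper named
- grow_buffer() that is not part of the quoted code. The result of realloc()
- goes into a temporary, so the old pointer survives a failure:

```c
#include <stdlib.h>

/*
 * grow_buffer: hypothetical helper, not part of the quoted code.
 * new_size must be nonzero.
 *
 * On success, updates *bufp and *sizep and returns 0.
 * On failure, leaves *bufp and *sizep untouched and returns -1, so the
 * caller still owns the old block and may keep using it or free() it.
 */
static int grow_buffer(char **bufp, size_t *sizep, size_t new_size)
{
    char *tmp = realloc(*bufp, new_size);

    if (tmp == NULL)
        return -1;          /* the old block is still valid here */

    *bufp = tmp;
    *sizep = new_size;
    return 0;
}
```

- With something like this, the loop would call grow_buffer(&buffer, ...) and
- decide what to do with the partial line when it returns -1.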
-
- >        buffer[bufsize-2] = '\0';
- >        if (fgets(buffer + size, bufsize - size, fp) == NULL) {
- >            len = size;
- >            flag = 1;
- >        }
- >        else if (!buffer[bufsize-2] || buffer[bufsize-2] == '\n') {
- >            len = size + strlen(buffer + size);
- >            flag = 1;
- >        }
- >    }
- >
- >    /*
- >     * Strip the newline from the line, if it's there.
- >     */
- >    if (buffer[len-1] == '\n')
- >        buffer[--len] = '\0';
- >
- >    if (len_out) *len_out = len;
- >    return buffer;
-
- Ah, you forgot to adjust the buffer size for the actual length read! If the
- line is 128K plus one byte, you will return a 256K buffer---the next higher
- power of two. No biggie, but it's easy to fix with a realloc down to the actual
- size. I'm not sure how paranoid one ought to be when checking the result of a
- _shrinking_ realloc, but I'd treat it the same way as a growing one to be safe.
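-
- A sketch of the shrinking step, assuming a hypothetical helper named
- shrink_to_fit(). It checks the result exactly as for a growing realloc(), and
- on failure it simply keeps the larger block, which is still valid:

```c
#include <stdlib.h>

/*
 * shrink_to_fit: hypothetical helper, not part of the quoted code.
 * Trims the block to len + 1 bytes (the line plus its terminating '\0').
 *
 * Returns the block to use from now on. If realloc() fails, the original
 * block is returned unchanged and *bufsize is left alone.
 */
static char *shrink_to_fit(char *buffer, size_t *bufsize, size_t len)
{
    char *tmp = realloc(buffer, len + 1);

    if (tmp == NULL)
        return buffer;      /* shrink failed; keep the bigger block */

    *bufsize = len + 1;     /* recorded size must match the new block */
    return tmp;
}
```

- Since my_getline() keeps its buffer in static variables, the recorded bufsize
- has to shrink along with the block, or the next call will overrun it.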
-
- >}
- >
-
- Nevertheless, my original comments about fgets() versus loops that use getc()
- apply: the above might be quite a bit cleaner if you were content to spoon out
- a character at a time. No strlen(), no odd buffer manipulation---just a pointer
- that you advance, and check against the buffer bounds.
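-
- Roughly what I have in mind, as a sketch only. The name read_line_getc() is
- made up, and unlike the quoted function this version returns a fresh malloc()ed
- buffer that the caller must free(), rather than reusing a static one:

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * read_line_getc: hypothetical getc()-based line reader.
 *
 * Returns a malloc()ed, '\0'-terminated line with the newline stripped, or
 * NULL on allocation failure or when EOF/error occurs before any character
 * is read. If len_out is non-NULL, the line length is stored there.
 */
char *read_line_getc(FILE *fp, size_t *len_out)
{
    size_t cap = 128, len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;

    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 >= cap) {               /* keep one byte for '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);                  /* old block is still ours */
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }

    if (c == EOF && len == 0) {             /* nothing read at all */
        free(buf);
        return NULL;
    }

    buf[len] = '\0';
    if (len_out != NULL)
        *len_out = len;
    return buf;
}
```

- Because the length is counted as characters are stored, it does not depend on
- strlen(), so a null character inside the line is kept and counted.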
-
- What about dealing with null characters in the input lines?
-
- --
- I'm not really a jerk, but I play one on Usenet.
-